nicolas-ragnell: Implemented a CUDA matrix multiplication #17

Open
nragnar wants to merge 1 commit into parallelcomputingabo:main from nragnar:nicolas-ragnell

Conversation

@nragnar nragnar commented May 31, 2025

  • I implemented Assignment 3 on the Mahti supercomputer. Connecting from Windows was troublesome and took a long time to get working, but once I was in, using Mahti with VS Code was easy.

  • At first I read the .raw files incorrectly a few times, which produced strange numbers, array length errors, or 0.0 as output. These turned out to be obvious mistakes on my part, and after fixing them I got the correct output.

  • The Mahti job queue was also quite long at times, and I had to wait 30 minutes for my results, but on Saturday it was almost instant.

  • I tested the tiled CUDA matrix multiplication with tile widths of 16 and 32 and did not notice any significant difference in speed.

  • Other than that, no major difficulties.
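
For reference, the tile-width comparison mentioned above could be reproduced with something like the sketch below. This is not the code from this PR: the kernel name `matmul_tiled`, the helper `run_check`, the matrix sizes, and the synthetic input values are all assumptions made for illustration. The .raw file loading is left out entirely, and CUDA error checking is omitted for brevity.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical tiled kernel: C (m x p) = A (m x n) * B (n x p), row-major.
// TILE is a compile-time tile width so that 16 and 32 can be compared.
template <int TILE>
__global__ void matmul_tiled(const float *A, const float *B, float *C,
                             int m, int n, int p) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    // Walk over the shared dimension one tile at a time.
    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        // Out-of-range threads load zeros so edge tiles stay correct.
        As[threadIdx.y][threadIdx.x] =
            (row < m && a_col < n) ? A[row * n + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (b_row < n && col < p) ? B[b_row * p + col] : 0.0f;
        __syncthreads();

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();
    }
    if (row < m && col < p) C[row * p + col] = acc;
}

// Runs one tile width on synthetic data, stores the kernel time in *ms,
// and returns the maximum absolute error against a CPU reference.
template <int TILE>
float run_check(int m, int n, int p, float *ms) {
    std::vector<float> A(m * n), B(n * p), C(m * p), ref(m * p, 0.0f);
    for (size_t i = 0; i < A.size(); ++i) A[i] = float(i % 7) - 3.0f;
    for (size_t i = 0; i < B.size(); ++i) B[i] = float(i % 5) - 2.0f;
    for (int i = 0; i < m; ++i)
        for (int k = 0; k < n; ++k)
            for (int j = 0; j < p; ++j)
                ref[i * p + j] += A[i * n + k] * B[k * p + j];

    float *dA, *dB, *dC;
    cudaMalloc(&dA, A.size() * sizeof(float));
    cudaMalloc(&dB, B.size() * sizeof(float));
    cudaMalloc(&dC, C.size() * sizeof(float));
    cudaMemcpy(dA, A.data(), A.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), B.size() * sizeof(float), cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((p + TILE - 1) / TILE, (m + TILE - 1) / TILE);
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    matmul_tiled<TILE><<<grid, block>>>(dA, dB, dC, m, n, p);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(ms, start, stop);

    cudaMemcpy(C.data(), dC, C.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    cudaEventDestroy(start); cudaEventDestroy(stop);

    float err = 0.0f;
    for (size_t i = 0; i < C.size(); ++i)
        err = fmaxf(err, fabsf(C[i] - ref[i]));
    return err;
}

int main() {
    float ms16 = 0.0f, ms32 = 0.0f;
    float e16 = run_check<16>(300, 200, 250, &ms16);
    float e32 = run_check<32>(300, 200, 250, &ms32);
    printf("TILE=16: %.3f ms, max error %g\n", ms16, e16);
    printf("TILE=32: %.3f ms, max error %g\n", ms32, e32);
    return 0;
}
```

A single timed launch like this is noisy, and the first launch carries one-time startup cost, so a fair comparison would probably need a warm-up run and an average over several launches. For small matrices, launch overhead may also hide any difference between the two tile widths.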
